-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bbox bottom right corner calculation #314
base: master
Are you sure you want to change the base?
Conversation
@nwojke can you, please, take a look? |
The boxes are given in (top left x, top left y, width, height) format. In order to get bottom right corner we need to add width and height to x and y correspondingly and subtract 1.
01d2085
to
a85801f
Compare
@nwojke any luck you can take a look? |
@nwojke can you, please, take a look? |
As far as I know W = x2 - x1 and H = y2 -y1. According to that x2(right) = x1(Let) + W not "x2(right) = x1(Let) + W -1". Maybe you are changing your bboxes in other parts of your implementation. |
Hello @SajjadPSavoji Thank you for the provided feedback. I really appreciate it.
Check out the places I was highlighting in the original comment. I agree, the problem seems a bit tricky. There are a few inconsistencies around this place. The original comment on the line 32 says bbox : array_like
The bounding box in format (x, y, width, height). I.e. if you have a one pixel box, let's say (0, 0, 1, 1), then it seems it is occupying the only pixel at the position 0,0. Both ways of handling this box would be correct: either consider the right and bottom corners of it in inclusive manner or exclusive one. Though, the lines 57 and 62 seem inconsistent to each other. The viewport on line 62 is considered originally as (0, 0, w, h). Right? # clip at image boundaries
bbox[:2] = np.maximum(0, bbox[:2])
bbox[2:] = np.minimum(np.asarray(image.shape[:2][::-1]) - 1, bbox[2:]) And the box is clipped by (0, 0, w - 1, h - 1). At line 57 we don't do the same thing. # convert to top left, bottom right
bbox[2:] += bbox[:2]
bbox = bbox.astype(np.int) Imagine we have a bounding box (0, 0, viewport_width, viewport_height). Consider those are the width and height of the bounding box (from the pydoc) we trim them at line 62, and the box becomes (0, 0, viewport_width - 1, viewport_height - 1). At the same time, there is nothing wrong with the box having the same width and height as the viewport as this box will be the box with the size of the whole image and it seems it should not be clipped in that case. As I said before, the inconsistency seems rather minor. However, to fix the issue, we can do a different change, for example, instead of changing boxes handling we can change the viewport handling in here. Something similar to this bbox[2:] = np.minimum(np.asarray(image.shape[:2][::-1]), bbox[2:]) Hopefully, the explanation seems reasonable. What do you think? P.S.: regardless of the way to fix the issue, I think without receiving the initial maintainers feedback there is nothing we can do about it. |
The boxes are given in (top left x, top left y, width, height) format. In order to get bottom right corner we need to add width and height to x and y correspondingly and subtract 1.
On the line 62 in
tools/generate_detections.py
you treat image rect correctly by subtracting 1 from the image box, which is originally (0, 0, w, h) and after subtracting (0, 0, w - 1, h - 1).however, you don't do it in other places across the repo, for example, on line 57 in the same file
By not doing so, you get a pixel that is diagonally to the bottom right corner. This minor inconsistency is really confusing.
There are 2 ways of how to resolve it:
rect
calculation across the codebase (subtract 1 from other rects when converted to (left, top, right, bottom))rect
consistent to otherrects
and do not subtract 1 for the image right bottom cornerI believe the first approach is more reasonable to take as it also fixes how
iou
is calculated. Below is an example of how one can figure out the right bottom corner of arect
:This is not a critical issue and should not dramatically affect metrics though would be really nice to fix.